Dr. John Snow & the 1854 Cholera “Ghost Map”

Dr. John Snow

Dr. John Snow (1813–1858) was a British physician whose analysis of the 1854 cholera epidemic in Soho, London, changed the course of public health. He doubted the orthodox "miasma" theory, which attributed the disease to bad air, and gathered evidence for what was then a radical idea: that cholera spread through contaminated water. Using careful records and a map he prepared himself, Snow plotted the addresses of cholera victims alongside the locations of public water pumps. This spatial reasoning revealed a striking pattern: most deaths were concentrated around a single pump on Broad Street.

Original Cholera Map

Snow's original “Ghost Map” is among the most influential data visualizations in history. The black bars on the map mark cholera deaths, and they stack up densely around the Broad Street pump. By combining geographical information with medical knowledge, Snow produced strong evidence that water, not air, was spreading the disease.

On 8 September 1854, authorities removed the pump handle. New cases plummeted almost immediately, a moment that came to symbolize the birth of modern epidemiology and evidence-based public health intervention.

Cholera Outbreak in Soho, 1854

Depiction of the 1854 cholera outbreak in Soho, London

In the late summer of 1854, the Soho district of London was unprepared for the wave of cholera that would soon overwhelm its health resources and devastate its population. People died within hours of falling sick, the streets emptied, and panic spread faster than the disease itself. The common explanation was "bad air", yet nobody could explain why some streets stayed healthy while neighbouring ones were stricken.

Dr. John Snow went from house to house, recording each death and asking one question: "Where do you get your water?" When he plotted the answers on a map, the pattern was clear: deaths were concentrated around the Broad Street pump. The pump handle was removed on 8 September 1854, and the outbreak soon came to an end, an early victory for data, mapping, and public health practice.

Summary — The 1854 Cholera Outbreak in Soho
📍 Location: Soho, London
🗓️ Year: 1854
⚰️ Deaths: ~600 in ~10 days
💡 Insight: Waterborne transmission

Introduction

The 1854 cholera outbreak in Soho, London, is a landmark event in the history of both data visualization and epidemiology. Dr. John Snow's map of cholera deaths revealed a strong spatial concentration around the Broad Street water pump, challenging the “miasma” theory and opening up a new line of scientific inquiry into disease patterns.

The main idea behind this redesign is to reinterpret Snow’s map using modern interactive and temporal visualization techniques. Python libraries such as Pandas and Folium, together with Folium's HeatMapWithTime plugin, add movement and interactivity to the original static visualization, so that users can explore the outbreak dynamically through both space and time.

By adding data-driven storytelling and animation, the redesign aims not only to increase user engagement but also to illustrate how the epidemic progressed, connecting historical understanding with current analytical power. This modern take pays tribute to Snow’s groundbreaking work and shows that interactive visualization can be a powerful aid to understanding complex epidemiological events today.

Objective

This redesign project aimed not just to reimagine John Snow’s 1854 cholera map, but also to turn it into an analytical and educational tool by adding contemporary, interactive visualization and data exploration methods. The intent was to go beyond reproducing the original map and to reveal more of what its data can show, using tools that were not available to Snow.

Process & Iterations

Iteration 1 – Concept Sketch

In the first iteration, I drew a draft sketch based on John Snow's original cholera map. Using Microsoft Paint, I shaded coloured areas to represent the density of deaths near the water pumps. This visual approach helped to illustrate the intensity of infection and the spatial relationships between cases and pump locations.

Figure 1. Iteration 1 — Concept Sketch
Snow’s cholera map redesign showing pumps, death points, highlighted streets, heat map boundary, and before/after death rate analysis.

Iteration 2 – Analytical Prototype

The concept sketch served as the foundation for the next step, in which I developed an analytical prototype in Python within a Jupyter Notebook. At this stage, the visual concepts were translated into a digital format, with the data plotted by coordinates and given early-stage interactivity. This prototype marked the transition from a static conceptual idea to an interactive analytical framework and laid the groundwork for the later iterations.

No street column detected; Top-5 streets layer skipped.

Timeline window: 1854-08-19 → 1854-09-30
Before 8 Sep 1854 — total: 470, avg/day: 23.5
After  8 Sep 1854 — total: 146,  avg/day: 6.35
Death-rate reduction after handle removal: 73.0%

Saved interactive HTML → images/iteration2_map_1854-08-19_to_1854-09-30.html
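
The before/after figures in the output above follow from the two totals and the window dates. The sketch below is a minimal reconstruction of that arithmetic, not the notebook's own code: the function name is hypothetical, and it assumes the "before" period runs from 19 August through 7 September and the "after" period from 8 September through 30 September, which is the split that matches the reported averages.

```python
from datetime import date

def rate_reduction(total_before, total_after, start, cutoff, end):
    """Return (avg_before, avg_after, percent_reduction) in deaths per day.

    Assumption: 'before' covers start..cutoff-1 and 'after' covers
    cutoff..end, both inclusive.
    """
    days_before = (cutoff - start).days      # cutoff day itself is excluded
    days_after = (end - cutoff).days + 1     # cutoff day and end day are included
    avg_before = total_before / days_before
    avg_after = total_after / days_after
    reduction = 100.0 * (1.0 - avg_after / avg_before)
    return avg_before, avg_after, reduction

# Totals and dates taken from the notebook output above.
avg_before, avg_after, reduction = rate_reduction(
    470, 146, date(1854, 8, 19), date(1854, 9, 8), date(1854, 9, 30)
)
print(f"before: {avg_before:.2f}/day, after: {avg_after:.2f}/day, "
      f"reduction: {reduction:.1f}%")
```

Because the two periods differ in length (20 and 23 days), the comparison is made on daily averages rather than on the raw totals.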

Iteration 3 – Interactive Maps, Informative Charts and Tables

In this phase, I developed a fully interactive visualization environment using Python. The redesigned map and analytical charts incorporated timeline sliders, correlation analyses, kernel density heatmaps, and forecast models, allowing users to explore the data dynamically.

Additionally, a short data-driven storyline was introduced to enhance comprehension of the spatial and temporal patterns of cholera cases, helping users visualize how the outbreak evolved and eventually subsided after key interventions.

Table 1 – Deaths Near Pumps (≈130 m radius)

This table summarises the cholera deaths recorded within about 130 metres of each water pump. Each row gives the pump's location, the number of deaths reported within that radius, and that number as a share of total deaths. A high nearby death count points to a likely contamination source, which agrees with John Snow’s finding that the Broad Street pump was a major centre of the 1854 outbreak. The TOTAL row at the bottom sums the nearby-death counts across all pumps. The individual shares sum to about 61% rather than 100%, which suggests that they are expressed relative to all deaths in the dataset rather than to the 297 in the TOTAL row.

 Combined Data — Deaths Around Each Pump (Including Total):

   Pump Name       Latitude   Longitude  Nearby Deaths  % of Total
0  Broad St.       51.513341  -0.136668            192       39.3%
1  Crown Chapel    51.513876  -0.139586             42        8.6%
2  Briddle St.     51.511542  -0.135919             40        8.2%
3  So Soho         51.512139  -0.133594             13        2.7%
4  Warwick         51.511295  -0.138199             10        2.0%
5  Gt Marlborough  51.514906  -0.139671              0        0.0%
6  Dean St.        51.512354  -0.131630              0        0.0%
7  Coventry St.    51.510019  -0.133962              0        0.0%
8  TOTAL                                           297      100.0%
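
One way to obtain counts like those in Table 1 is to measure the great-circle distance from each pump to each death location and sum the deaths that fall inside the radius. The sketch below assumes the haversine formula and a 130 m cut-off; the function names and the three sample death records are hypothetical and serve only to illustrate the mechanics, while the Broad Street coordinates come from the table.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def deaths_near(pump, deaths, radius_m=130.0):
    """Sum the death counts at locations within radius_m of the pump."""
    return sum(n for lat, lon, n in deaths
               if haversine_m(pump[0], pump[1], lat, lon) <= radius_m)

broad_st = (51.513341, -0.136668)  # coordinates from Table 1

# Hypothetical (lat, lon, deaths) records, for illustration only.
sample_deaths = [
    (51.513400, -0.136700, 3),   # a few metres from the pump
    (51.513900, -0.137500, 2),   # roughly 85 m away
    (51.515500, -0.136668, 4),   # roughly 240 m north, outside the radius
]
print(deaths_near(broad_st, sample_deaths))
```

With a fixed radius per pump, a death that lies within 130 m of two pumps is counted for both, so per-pump counts obtained this way are not guaranteed to partition the data.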

Figure 2 – Daily Cholera Deaths with 7-Day Moving Average

The line chart below shows the number of cholera deaths per day in Soho during the 1854 epidemic. The upper curve gives the daily deaths, and the lower one gives a seven-day moving average that brings out the general trend of the outbreak. The dotted vertical line marks 8 September 1854, the date the Broad Street pump handle was removed. Deaths fell quickly after this date, which is consistent with John Snow's conclusion that the pump was the source of the outbreak.
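
The moving average described above can be sketched in a few lines. This version assumes a trailing window (each value averages the current day and the six days before it); whether the notebook used a trailing or a centred window is not stated. The daily counts are synthetic, chosen only to give a clear rise and fall. In pandas, `Series.rolling(7).mean()` computes the same trailing average, with NaN in place of `None` for the first six days.

```python
def moving_average(values, window=7):
    """Trailing moving average; None until a full window is available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]   # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out

# Synthetic daily death counts, for illustration only.
daily = [0, 0, 7, 14, 21, 28, 35, 28, 21, 14, 7, 0, 0, 0]
smoothed = moving_average(daily)

for day, (raw, avg) in enumerate(zip(daily, smoothed)):
    print(day, raw, avg)
```

In this synthetic series the raw peak falls on day 6 while the smoothed peak falls on day 9. A trailing average always lags the raw data in this way, which is worth keeping in mind when reading the smoothed curve against the 8 September marker.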

Figure 3 – Daily Cholera Attacks with 7-Day Moving Average

This chart displays the daily reported cholera cases (attacks) during the Soho epidemic of 1854. The primary line shows new cases per day, while the smoother line shows the 7-day moving average to bring out the general trend. The vertical dotted line marks 8 September 1854, when the Broad Street pump handle was removed. After that date the number of new cases dropped markedly, which is consistent with the contaminated water source having been cut off.

Figure 4 – Cumulative Cholera Deaths Over Time

The graph presents the cumulative deaths from cholera in Soho over the course of the 1854 epidemic. The line can only rise, since each day's deaths are added to the running total; its slope shows how fast deaths were accumulating. The figure marks the start and end of the observation period, highlighting how quickly deaths rose at first and how the rise slowed after the Broad Street pump handle was removed. This visualization clearly shows the epidemic’s rise and eventual decline.

Figure 5 – Total Deaths Before and After Pump Handle Removal

This figure compares the total number of cholera deaths reported before and after 8 September 1854, when the handle of the Broad Street pump was removed. It shows a sharp drop in deaths after the intervention, which supports John Snow's hypothesis that the Broad Street pump was the contaminated source behind the outbreak.

Figure 6 – Total Cholera Attacks Before and After Pump Handle Removal

This figure shows the total number of cholera cases (attacks) recorded before and after the Broad Street pump handle was removed on 8 September 1854.
The sharp decline in cases after the removal supports Dr. John Snow’s theory that cutting off the contaminated water supply would quickly halt the spread of the disease.

Figure 7 – Deaths Aggregated by Nearest Pump

This bar chart summarises the cholera deaths recorded closest to each water pump during the 1854 Soho outbreak. The length of each bar indicates the number of deaths for which that pump was the nearest. The pump with the highest death toll, the Broad Street pump, is the most likely contaminated source, in line with John Snow’s research.

Figure 8 – Deaths vs Distance from Broad Street Pump (50 m bins)

This figure shows how the number of cholera deaths during the 1854 Soho outbreak changes with distance from the Broad Street pump.

Each dot gives the death toll within one 50-metre distance band around the pump.

The downward slope of the points and the fitted curve indicates that most fatalities occurred near the pump and that the death count dropped quickly with distance.
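
The 50 m binning behind this kind of plot can be sketched as follows. The function name and the sample records are hypothetical, and the sketch assumes that each record already carries its distance from the pump in metres.

```python
from collections import Counter

def deaths_by_distance_band(records, band_m=50):
    """Total deaths per distance band.

    records: iterable of (distance_m, deaths) pairs.
    Returns {band_start_m: deaths}; a band covers
    [band_start_m, band_start_m + band_m).
    """
    totals = Counter()
    for distance_m, deaths in records:
        band_start = int(distance_m // band_m) * band_m
        totals[band_start] += deaths
    return dict(sorted(totals.items()))

# Hypothetical (distance from pump in metres, deaths) pairs.
records = [(12.0, 5), (49.9, 3), (50.0, 4), (120.5, 2), (260.0, 1)]
bands = deaths_by_distance_band(records)
print(bands)
```

Bands with no deaths do not appear in the result, so they may need to be filled with zeros before plotting or fitting a trend line. Outer bands also cover a larger ring area than inner ones, which is one reason raw counts per band are not the same thing as a mortality rate.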

Figure 9 – Top-5 Pumps by Share of Cholera Deaths (Donut Chart)

This donut chart shows how cholera deaths were distributed among the five most affected water pumps in Soho during the 1854 outbreak. It has five segments, one per pump, with slice size reflecting each pump’s percentage of total deaths. The Broad Street pump occupies the largest portion, reinforcing its role as the outbreak’s centre. The smaller slices for the other pumps indicate relatively fewer deaths, suggesting limited or secondary contamination near those locations.

Figure 10 – Daily Cholera Deaths (1854 Soho)

This figure shows the day-by-day count of cholera deaths during the 1854 Soho epidemic. The line graph shows how deaths changed over time, making the outbreak’s rise and fall easy to see. A dashed vertical line marks 8 September 1854, the day the handle was removed from the Broad Street pump, the suspected source of infection; deaths declined rapidly afterwards. This visualization underscores the effect of Dr. John Snow’s water-source intervention.

Figure 11 – Interactive Cumulative Cholera Deaths Over Time

This interactive line chart presents the cumulative count of cholera deaths during the 1854 Soho outbreak. Each dot gives the total number of deaths up to that day, giving a clear picture of the epidemic’s development. A red dashed vertical line marks 8 September 1854, the day the handle of the Broad Street pump was removed. The line flattens noticeably after this event, reflecting a sharp drop in new deaths, which is strong visual evidence that Dr. John Snow's intervention was effective in stopping the spread of cholera.

Figure 12 – Correlation Between Distance from Broad Street Pump and Deaths

The figure shows the inverse relationship between the number of cholera deaths and the distance from the Broad Street pump in the 1854 Soho outbreak. Each dot corresponds to the number of deaths within a 50-metre band around the pump, and the red line shows a clear negative trend, indicating that the nearer a band lay to the pump, the higher its death count. This visual evidence supports Dr. John Snow’s hypothesis that cholera was transmitted through contaminated water.
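
The negative trend described above can be quantified with a Pearson correlation coefficient and a least-squares line. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the band midpoints stand in for distance, and the death counts are invented for the example rather than taken from the notebook.

```python
from math import sqrt

def linear_trend(xs, ys):
    """Return (r, slope, intercept) for a least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)          # Pearson correlation coefficient
    slope = sxy / sxx                  # change in deaths per metre of distance
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

band_mid_m = [25, 75, 125, 175, 225]   # midpoints of 50 m bands
deaths = [90, 60, 30, 12, 8]           # hypothetical counts per band

r, slope, intercept = linear_trend(band_mid_m, deaths)
print(f"r = {r:.3f}, slope = {slope:.3f} deaths per metre")
```

With only a handful of bands, the coefficient is sensitive to individual points, so it is best read as a summary of the plotted pattern rather than as a precise estimate.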

Figure 13 – Correlation Matrix: Distance, Deaths, and Population Density

The heatmap depicted in this figure reveals the correlations between three variables: distance from the pump, mortality (deaths), and population density.
Colour intensity shows the strength and direction of each pairwise relationship.
A negative correlation between distance and deaths indicates that more people died closer to the pump, while a positive correlation between deaths and population density suggests that more densely populated areas were more affected.

Figure 14 – Kernel Density Heatmap of Cholera Deaths

This figure shows the spatial distribution of cholera deaths in the Soho area during the 1854 outbreak.
Kernel density estimation highlights where death records concentrate: warm (red) zones represent more deaths per unit area.
The area around the Broad Street pump shows a clear concentration of deaths, supporting John Snow's conclusion that polluted water was the main source of the outbreak.
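
To make "deaths per unit area" concrete, the sketch below evaluates a Gaussian kernel density estimate at a single location. It is an illustration under stated assumptions, not the notebook's method: the function name is hypothetical, the coordinates are assumed to be already projected to metres with the pump at the origin, the 50 m bandwidth is arbitrary, and the sample points are invented.

```python
from math import exp, pi

def kde_at(x, y, points, bandwidth_m=50.0):
    """Gaussian kernel density at (x, y), in deaths per square metre.

    points: iterable of (x_m, y_m, deaths) in a local metric coordinate
    system. Each kernel integrates to 1, so the whole surface integrates
    to the total number of deaths.
    """
    h2 = bandwidth_m ** 2
    total = 0.0
    for px, py, weight in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += weight * exp(-d2 / (2 * h2))
    return total / (2 * pi * h2)

# Hypothetical death locations as (x_m, y_m, deaths), pump at (0, 0).
points = [(10.0, 5.0, 4), (-20.0, 15.0, 3), (30.0, -10.0, 2), (300.0, 0.0, 1)]

near_pump = kde_at(0.0, 0.0, points)
far_away = kde_at(300.0, 0.0, points)
print(near_pump > far_away)
```

Evaluating the function on a regular grid of (x, y) positions gives the surface that a heatmap colours. The bandwidth controls how far each death "spreads": a small value produces isolated spots, while a large one blurs separate clusters together.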

Figure 15 – Forecast of Future Daily Cholera Deaths (with Confidence Intervals)

This figure shows an ARIMA-based forecast of daily cholera deaths for the ten days following the last recorded day in the 1854 Soho data.
The solid pink line shows the actual deaths, while the dashed blue line shows the forecasted deaths.
The light blue shaded band indicates the 95% confidence interval, reflecting the uncertainty around the prediction.
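
The idea of a forecast with a widening uncertainty band can be illustrated with a first-order autoregressive model, AR(1), which is the same as ARIMA(1,0,0) with a constant. This is a deliberately simpler stand-in for whatever ARIMA order the notebook used (the order is not stated), fitted here by ordinary least squares. The function names and the input series are hypothetical, and the interval ignores uncertainty in the fitted parameters.

```python
from math import sqrt

def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] + noise."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    residuals = [b - (c + phi * a) for a, b in zip(x, y)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)   # residual variance
    return c, phi, sigma2

def forecast_ar1(series, steps=10, z=1.96):
    """Return [(mean, lower, upper)] for the next `steps` days."""
    c, phi, sigma2 = fit_ar1(series)
    out = []
    level = series[-1]
    var = 0.0
    for h in range(steps):
        level = c + phi * level
        var += sigma2 * phi ** (2 * h)     # error variance grows with horizon
        half = z * sqrt(var)
        # Deaths cannot be negative, so the lower bound is clamped at zero.
        out.append((level, max(0.0, level - half), level + half))
    return out

# Hypothetical daily deaths at the tail of an outbreak.
series = [40, 33, 30, 24, 22, 17, 15, 13, 10, 9, 8, 6]
forecast = forecast_ar1(series)
for day, (mean, lower, upper) in enumerate(forecast, start=1):
    print(f"day +{day}: {mean:.1f} ({lower:.1f} to {upper:.1f})")
```

The band widens with each step because the errors of earlier steps carry forward into later ones, which is the behaviour the shaded region in the figure represents.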

Figure 16 – Pump Voronoi Influence Zones with Deaths per Zone

This interactive map divides the Soho area into Voronoi zones, where each polygon contains the locations that are closer to one particular water pump than to any other.
The colour saturation of each zone reflects the number of cholera deaths reported in that area, with darker colours signifying higher death counts.
The Broad Street pump’s zone stands out with the highest mortality, visually supporting Dr. John Snow’s conclusion that this pump was a main contributor to the cholera outbreak.
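
Counting deaths per Voronoi zone does not require building the polygons: a point lies in a pump's zone exactly when that pump is its nearest one. The sketch below uses that equivalence with the pump coordinates from Table 1. The function names, the planar distance approximation, and the sample death records are assumptions made for illustration.

```python
from collections import Counter
from math import cos, hypot, radians

PUMPS = {  # names and coordinates as listed in Table 1
    "Broad St.": (51.513341, -0.136668),
    "Crown Chapel": (51.513876, -0.139586),
    "Briddle St.": (51.511542, -0.135919),
    "So Soho": (51.512139, -0.133594),
    "Warwick": (51.511295, -0.138199),
    "Gt Marlborough": (51.514906, -0.139671),
    "Dean St.": (51.512354, -0.131630),
    "Coventry St.": (51.510019, -0.133962),
}

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def planar_distance_m(a, b):
    """Approximate distance in metres, adequate over a few hundred metres."""
    mean_lat = radians((a[0] + b[0]) / 2)
    dy = (a[0] - b[0]) * M_PER_DEG_LAT
    dx = (a[1] - b[1]) * M_PER_DEG_LAT * cos(mean_lat)
    return hypot(dx, dy)

def nearest_pump(point):
    """Name of the pump whose Voronoi zone contains the point."""
    return min(PUMPS, key=lambda name: planar_distance_m(point, PUMPS[name]))

def deaths_per_zone(deaths):
    """Sum (lat, lon, n) death records by nearest pump, keeping empty zones."""
    totals = Counter({name: 0 for name in PUMPS})
    for lat, lon, n in deaths:
        totals[nearest_pump((lat, lon))] += n
    return totals

# Hypothetical (lat, lon, deaths) records, for illustration only.
sample = [(51.5133, -0.1367, 4), (51.5138, -0.1395, 1), (51.5124, -0.1317, 2)]
zones = deaths_per_zone(sample)
print(zones.most_common(3))
```

These zones are based on straight-line distance. Zones based on walking distance along the street network could place some addresses with a different pump.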

Figure 17 – Animated Heat Map of Cholera Deaths Over Time

This animated map illustrates the timeline of cholera deaths in Soho. Each dot corresponds to one or more documented deaths, while the evolving heat map conveys the outbreak's movement, peak, and decline over time. The darkest and hottest colours mark the places and times with the greatest density of deaths, which lie near Broad Street.

Figure 18 – Animated Temporal Cholera Death Map

The animation below shows the timeline of cholera deaths in Soho. Each dot represents one or more recorded deaths, while the changing heat map shows the outbreak moving from its peak to its decline across the dates. Darker and warmer areas mark the places and times with the highest death tolls, exposing how the 1854 outbreak shifted in time and space.
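
An animated heat map of this kind needs the data arranged as one frame per time step, each frame being a list of weighted points. The sketch below shows one way to prepare that structure using only the standard library. The function name and the sample records are hypothetical, and scaling the weights into the range (0, 1] is an assumption about what the plotting layer expects.

```python
from collections import defaultdict
from datetime import date, timedelta

def build_frames(records, start, end):
    """Group (day, lat, lon, deaths) records into one frame per day.

    Returns (labels, frames): labels are ISO date strings and frames[i]
    is a list of [lat, lon, weight], with weight scaled into (0, 1].
    Days without deaths produce an empty frame.
    """
    by_day = defaultdict(list)
    for day, lat, lon, deaths in records:
        by_day[day].append((lat, lon, deaths))
    max_deaths = max(deaths for _, _, _, deaths in records)

    labels, frames = [], []
    day = start
    while day <= end:
        labels.append(day.isoformat())
        frames.append([[lat, lon, n / max_deaths]
                       for lat, lon, n in by_day.get(day, [])])
        day += timedelta(days=1)
    return labels, frames

# Hypothetical (day, lat, lon, deaths) records, for illustration only.
records = [
    (date(1854, 9, 1), 51.5133, -0.1367, 2),
    (date(1854, 9, 1), 51.5138, -0.1395, 1),
    (date(1854, 9, 3), 51.5133, -0.1367, 4),
]
labels, frames = build_frames(records, date(1854, 9, 1), date(1854, 9, 3))
print(labels)

# With folium installed, the frames could then be handed to the plugin,
# for example HeatMapWithTime(frames, index=labels).add_to(some_map).
# That call is not executed here.
```

Keeping the date labels alongside the frames lets the animation's time slider show the calendar date for each step.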

Iteration 4 – Fishnet Intensity Grid (Advanced Spatial Analysis)

During this stage, a spatial grid of 100 m × 100 m (fishnet) was constructed over the Soho area to calculate and display cholera deaths in a spatially structured manner.
Each grid cell represented a specific area and was coloured according to the number of reported deaths, with darker or warmer shades indicating higher mortality intensity.
This systematic display aggregates discrete point data into a regular gridded surface, allowing clearer identification of clustering patterns and reinforcing Dr. John Snow’s pinpointing of Broad Street as the outbreak’s focal point.
The grid-based visualisation therefore provides a more analytical perspective on spatial density, complementing the interactive timeline by showing how the disease’s severity was geographically concentrated within certain neighbourhood blocks.

Figure 19 – Street-Grid (Fishnet) Intensity Map of Cholera Deaths: The Soho region is divided into 100 m × 100 m cells; each cell’s colour encodes the number of deaths recorded within it. Darker/warmer shading indicates higher intensity. This grid-based view highlights the strong clustering around Broad Street, reinforcing Dr. John Snow’s finding about the contaminated pump being the outbreak’s focal point.
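
The fishnet aggregation can be sketched as a conversion from latitude and longitude offsets to metres, followed by integer division into 100 m cells. The function name, the choice of grid origin, the metres-per-degree approximation, and the sample records are assumptions for illustration; the two pump coordinates used in the sample come from Table 1.

```python
from collections import Counter
from math import cos, floor, radians

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude

def fishnet_counts(deaths, origin, cell_m=100.0):
    """Count deaths per square grid cell.

    deaths: iterable of (lat, lon, n); origin: (lat, lon) of the grid's
    south-west corner. Returns {(col, row): deaths}, where col counts
    cells eastwards and row counts cells northwards from the origin.
    """
    m_per_deg_lon = M_PER_DEG_LAT * cos(radians(origin[0]))
    counts = Counter()
    for lat, lon, n in deaths:
        row = floor((lat - origin[0]) * M_PER_DEG_LAT / cell_m)
        col = floor((lon - origin[1]) * m_per_deg_lon / cell_m)
        counts[(col, row)] += n
    return dict(counts)

origin = (51.5090, -0.1430)  # hypothetical south-west corner of the grid

# Hypothetical (lat, lon, deaths) records placed at or near two pumps.
sample = [
    (51.513341, -0.136668, 3),   # at the Broad St. pump
    (51.513400, -0.136700, 2),   # a few metres away, same cell
    (51.510019, -0.133962, 1),   # at the Coventry St. pump
]
grid = fishnet_counts(sample, origin)
print(grid)
```

Cell counts depend on where the grid origin is placed: shifting it by half a cell can split a cluster across neighbouring cells, so the hot spot is best read together with the point and density views.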

4. Tools and Technologies

   Tool / Library                    Purpose
0  Python 3 (Jupyter Notebook)       Environment for code execution and visualization
1  Pandas                            Data manipulation, cleaning, and structuring
2  Folium                            Creation of interactive geographic and heat maps
3  HeatMapWithTime (folium.plugins)  Animated temporal visualization of cholera spread
4  NumPy                             Mathematical operations and numerical computation
5  Matplotlib                        Static plotting and map overlays for data comp...

5. Design Rationale

The new design builds on John Snow's original story by adding modern perspectives on time, place, and data, leading to a better understanding of the epidemic.
Every chart and map highlights a different aspect of the event, explaining the causes, timing, and spatial focus of the cholera deaths, and why the Broad Street pump was identified as the outbreak’s main source.

Component | Purpose | Rationale
Animated Temporal Cholera Death Map (HeatMapWithTime) | Show how deaths changed over time | The timeline animation illustrates spread, peak, and decline day by day.
Fishnet Intensity Grid (100 m × 100 m) | Quantify deaths per area cell | Converts points into a measurable surface to highlight spatial hot spots, especially around Broad Street.
Voronoi Influence Zones (per Pump) | Partition area by nearest pump | Shows each pump’s catchment and total deaths per zone; Broad Street’s zone stands out, reinforcing source attribution.
Pump Markers | Identify water source locations | Links high-death zones to the Broad Street pump, supporting Snow’s conclusion.
Daily Deaths (Line Chart) + 7-day MA | Show daily trends | Shows rise and post–8 Sep decline; MA reduces noise for clearer trend.
Daily Attacks (Line Chart) + 7-day MA | Track infection pressure | Complements deaths by showing infection dynamics pre/post intervention.
Cumulative Deaths (Line / Interactive) | Display total outbreak progress | Highlights slope change after intervention; interactive version aids exploration.
Before vs After (Bar Charts) | Compare intervention impact | Clear drop in deaths and attacks after 8 Sep 1854.
Deaths by Nearest Pump (Bar Chart) | Rank pump impact | Quantifies Broad Street’s dominance relative to others.
Distance vs Deaths (Scatter + Trend) | Relate deaths to pump distance | Deaths decrease with distance from Broad Street; trendline makes it explicit.
Top-5 Pumps (Donut Chart) | Show proportional deaths | Quick view of shares across the major pumps.
Correlation Heatmap | Explore relationships | Summarises associations (e.g., deaths vs distance/density).
ARIMA Forecast (Line Chart) | Illustrate prediction | Simple time-series projection with uncertainty bands.

6. Reflection and Creativity

The redesign does not seek to improve upon John Snow's original map; rather, it uses modern visualization tools to revisit it. Through temporal animation and interactive spatial layers, the project can show not only where the outbreak occurred but also when it peaked and when it subsided.

A few of the innovations are listed below:

  • Timeline-based heat maps that turn the historical data into an animated story of how the outbreak grew and then faded.
  • Dynamic clustering and interactive markers that invite users to explore the area and find the densest concentrations of deaths.
  • Narrative captions that guide non-expert users through the key findings and the historical background.
  • Multiple analytical layers, such as the fishnet grid, the Voronoi zones, and the forecast, that offer both creative and analytical views of the data.

7. Conclusion

The redesigned interactive visualization combines John Snow's original cholera map with modern analytical tools. Temporal heat maps, fishnet grids, Voronoi zones, and supporting charts such as line graphs and donut charts work together to turn the historical map into a living, data-driven narrative. These elements show the distribution of cholera cases over both time and space and point to the Broad Street pump as the main source of the outbreak. This project demonstrates how interactive data visualization can revive historical data, providing deeper insight into epidemiology, spatial reasoning, and public health storytelling.